The TopCoder component development process places a huge emphasis on unit testing the components to ensure that they function correctly – and don’t fail in unexpected ways – even in the face of extreme loads and invalid inputs. Well-written unit tests by developers and reviewers achieve this goal rather well, but what happens when the test suite misses testing a crucial method or statement? How can the development process itself be improved to avoid such problems? Can we incorporate methods to measure the extensiveness of our test suite? The answer to these questions is “yes, we can,” thanks to code coverage.
Code coverage is all about measuring how well your tests cover your code. It’s about finding out which parts of the code base are executed by the test suite and, therefore, which parts are not. A line of code not executed during testing is a potential location for bugs that the tests didn’t find. Armed with this knowledge, we can write more tests which cover these lines, thus increasing the quality of the code base. In essence, code coverage helps us increase and verify the quality of the code base by helping us improve the quality of the test suite.
There are various ways to measure coverage - not all of them are supported by every coverage tool. Although an explanation of each coverage measure will be beyond the scope of this article, we’ll take a look at some of the most common ones below, along with some examples.
Method coverage tells us whether each method in our code base is invoked by our test cases or not. It is one of the most basic measures of coverage and helps us to identify the biggest shortcomings in our test suite.
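As a rough illustration of what method coverage counts, the sketch below records which methods of an interface are invoked and divides by the number of methods declared. It uses a JDK dynamic proxy purely for demonstration; real coverage tools do not work this way, and `Calculator`, `SimpleCalculator`, and `tracked` are hypothetical names invented for this example.

```java
import java.lang.reflect.Proxy;
import java.util.Set;
import java.util.TreeSet;

class MethodCoverageSketch {
    /** Hypothetical component interface, used only for this illustration. */
    interface Calculator {
        int add(int a, int b);
        int subtract(int a, int b);
    }

    static class SimpleCalculator implements Calculator {
        public int add(int a, int b) { return a + b; }
        public int subtract(int a, int b) { return a - b; }
    }

    /** Names of the methods that have been called through a tracked proxy. */
    static final Set<String> INVOKED = new TreeSet<>();

    /** Wraps the target so that every call is recorded before being forwarded. */
    static Calculator tracked(Calculator target) {
        return (Calculator) Proxy.newProxyInstance(
                Calculator.class.getClassLoader(),
                new Class<?>[] { Calculator.class },
                (proxy, method, args) -> {
                    INVOKED.add(method.getName());
                    return method.invoke(target, args);
                });
    }

    /** Percentage of the interface's declared methods that were invoked. */
    static double methodCoveragePercent() {
        return 100.0 * INVOKED.size() / Calculator.class.getDeclaredMethods().length;
    }

    public static void main(String[] args) {
        Calculator calculator = tracked(new SimpleCalculator());
        calculator.add(1, 2); // a "test suite" that never calls subtract
        System.out.println(INVOKED + " -> " + methodCoveragePercent() + "%");
    }
}
```

Here the only call is to `add`, so half of the declared methods are reported as covered and `subtract` stands out as untested.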
Line coverage, also called statement coverage, tells us about each line that is executed during the run. This form of coverage is very easy to associate with individual source code lines because the developer is immediately informed about the lines that are not executed. Unfortunately, line coverage can miss cases in which the logic may be at fault, even though each line of code gets executed. For example, look at the following piece of code:
    public String lineCoverageExample(boolean condition) {
        String result = null;
        if (condition) {
            result = getCustomer()
                    .getAddress();
        }
        // result is still null here when condition is false
        return result.trim();
    }
Here, if all the tests always call the method with condition as true, the code is reported to be fully covered (though it would throw a NullPointerException when condition is false). Apart from the case in the above example, line coverage also does not report whether for loops reach their termination conditions or whether they are terminated forcefully by break statements. It also does not distinguish between the various cases in a switch-case statement block.
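The blind spot described above can be reproduced in a few lines. This is a minimal sketch: `lookupAddress` is a hypothetical stand-in for `getCustomer().getAddress()`, and `throwsNpe` is a helper added only for the demonstration.

```java
class LineCoverageDemo {
    /** Hypothetical stand-in for getCustomer().getAddress(). */
    static String lookupAddress() {
        return "  42 Example Street  ";
    }

    static String lineCoverageExample(boolean condition) {
        String result = null;
        if (condition) {
            result = lookupAddress();
        }
        return result.trim(); // fails when condition is false
    }

    /** Returns true if calling the method with this input throws a NullPointerException. */
    static boolean throwsNpe(boolean condition) {
        try {
            lineCoverageExample(condition);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // This single call executes every line of lineCoverageExample.
        System.out.println(lineCoverageExample(true));
        // The untested input still fails.
        System.out.println("NPE when condition is false: " + throwsNpe(false));
    }
}
```

A test suite containing only the first call would show every line as executed, yet the second call demonstrates that the defect is still there.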
Instead of measuring individual lines, basic block coverage considers each sequence of non-branching statements as a unit of code. This means that, for an if statement, this measure will report 100% coverage only when both the then and the else blocks get executed. For example, consider the following piece of code:
    public void basicBlockCoverageExample(boolean condition) {
        if (condition) {
            System.out.println("This block contains only one line.");
        } else {
            System.out.println("This block contains multiple lines. This is the first.");
            System.out.println("This is the second.");
            System.out.println("This is the third.");
            System.out.println("This is the fourth.");
            System.out.println("This is the fifth.");
            System.out.println("This is the sixth.");
        }
    }
In the above example, if we always pass false as the parameter, line coverage will be close to 90% – giving the illusion of good coverage despite missing a whole branch. In comparison, basic block coverage will report coverage of only 50% for this example.
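The arithmetic behind those two figures can be worked through with a small model of the example method. This is only a sketch under stated assumptions: it counts the if line plus the seven println statements as the executable lines, and, following the article, counts just the then and else blocks for block coverage. A real tool may count lines and blocks differently (for instance by treating the condition as a block of its own).

```java
import java.util.EnumSet;
import java.util.Set;

class BlockVersusLineCoverage {
    /** The pieces of basicBlockCoverageExample and how many statements each holds. */
    enum Block {
        CONDITION(1), THEN(1), ELSE(6);

        final int lines;

        Block(int lines) {
            this.lines = lines;
        }
    }

    /** Simulates calling the method once per input and records the pieces reached. */
    static Set<Block> executed(boolean... inputs) {
        Set<Block> hit = EnumSet.noneOf(Block.class);
        for (boolean condition : inputs) {
            hit.add(Block.CONDITION);
            hit.add(condition ? Block.THEN : Block.ELSE);
        }
        return hit;
    }

    /** Executed statements as a percentage of all statements. */
    static double lineCoverage(Set<Block> hit) {
        int total = 0;
        int covered = 0;
        for (Block block : Block.values()) {
            total += block.lines;
            if (hit.contains(block)) {
                covered += block.lines;
            }
        }
        return 100.0 * covered / total;
    }

    /** Executed branch blocks (then, else) as a percentage of the two that exist. */
    static double branchBlockCoverage(Set<Block> hit) {
        int covered = 0;
        if (hit.contains(Block.THEN)) covered++;
        if (hit.contains(Block.ELSE)) covered++;
        return 100.0 * covered / 2;
    }

    public static void main(String[] args) {
        Set<Block> hit = executed(false);
        System.out.println(lineCoverage(hit));        // 87.5
        System.out.println(branchBlockCoverage(hit)); // 50.0
    }
}
```

Seven of the eight statements run when only false is passed, which gives 87.5% line coverage, while only one of the two branch blocks runs.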
Some tools support partial line coverage, meaning that they project basic block coverage onto the source code line structure, so that a single line containing multiple blocks is reported as only partially covered unless the tests exercise every block on it.
Decision coverage, also known as branch coverage, tells us whether the expressions in condition statements evaluate both to true and false during testing. With this measure we can find tests which test if-then-else blocks only one way, i.e. tests that exercise only the then or the else but not both. The problem with this approach is that decision coverage considers the expression as a whole and cannot detect whether individual conditions within the expression have been evaluated both ways or not. For example, it cannot detect whether a particular branch in the boolean expression was short-circuited or not, as in the case of a logical OR expression.
    public void decisionCoverageExample(boolean first, boolean second) {
        if (first && (second || anotherCondition())) {
            //some code here
        } else {
            //some code here
        }
    }
In this example, if the tests call the method with the first parameter as true as well as false and the second parameter with true, then decision coverage will report the above code as fully covered even if anotherCondition() is never evaluated.
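The short-circuit effect is easy to observe by counting calls. In this sketch, `anotherCondition` is given a hypothetical body that increments a counter, and `decide` returns the value of the same expression used in the if statement above.

```java
class DecisionCoverageDemo {
    /** Number of times anotherCondition() has actually been evaluated. */
    static int anotherConditionCalls = 0;

    /** Hypothetical implementation: the return value is irrelevant here, only the call count matters. */
    static boolean anotherCondition() {
        anotherConditionCalls++;
        return true;
    }

    /** Same boolean expression as in decisionCoverageExample. */
    static boolean decide(boolean first, boolean second) {
        return first && (second || anotherCondition());
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true));    // true: the decision takes the then branch
        System.out.println(decide(false, true));   // false: the decision takes the else branch
        System.out.println(anotherConditionCalls); // 0: never evaluated
    }
}
```

Both outcomes of the decision are exercised, so decision coverage is complete, while the call counter stays at zero.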
Condition coverage is a variant of branch coverage that checks whether each sub-expression in a boolean expression evaluates to both true and false during testing. There is also a variant of condition coverage called multiple condition coverage, which takes the whole expression and all the sub-expressions into consideration. It tells us whether every possible combination of the sub-expressions is evaluated during testing or not. As you can imagine, the number of tests needed for full coverage with this approach increases with the complexity of the boolean expressions involved. However, this isn’t always predictable – two boolean expressions with a similar number of sub-expressions can require a very different number of test cases for full coverage.
One note: Full condition coverage does not imply full decision coverage, because condition coverage does not consider the expression as a whole.
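A two-input example makes this note concrete. The sketch below is illustrative only: it uses the non-short-circuit `&` operator so that both conditions are always evaluated, and the class and method names are invented for the demonstration.

```java
class ConditionVersusDecision {
    /** The decision under test; '&' evaluates both operands every time. */
    static boolean decision(boolean a, boolean b) {
        return a & b;
    }

    /** True if, across the tests, each of a and b was seen as both true and false. */
    static boolean fullConditionCoverage(boolean[][] tests) {
        boolean aTrue = false, aFalse = false, bTrue = false, bFalse = false;
        for (boolean[] t : tests) {
            if (t[0]) aTrue = true; else aFalse = true;
            if (t[1]) bTrue = true; else bFalse = true;
        }
        return aTrue && aFalse && bTrue && bFalse;
    }

    /** True if, across the tests, the whole decision was seen as both true and false. */
    static boolean fullDecisionCoverage(boolean[][] tests) {
        boolean sawTrue = false, sawFalse = false;
        for (boolean[] t : tests) {
            if (decision(t[0], t[1])) sawTrue = true; else sawFalse = true;
        }
        return sawTrue && sawFalse;
    }

    public static void main(String[] args) {
        boolean[][] tests = { { true, false }, { false, true } };
        System.out.println(fullConditionCoverage(tests)); // true
        System.out.println(fullDecisionCoverage(tests));  // false
    }
}
```

With the inputs (true, false) and (false, true), each condition takes both values but the decision is false both times, so the then branch is never taken.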
Path coverage tells us whether each possible path from the method entry to the return is executed or not. For this purpose, a “path” is defined as a unique sequence of branches from the method entry to the exit, with “exit” defined as either successful completion of the method or a throw statement. The problem with this approach is that there are cases when executing all theoretically possible paths is practically impossible. For example, consider the following piece of code:
    public void pathCoverageExample(boolean condition) {
        if (condition) {
            //Path 1
        } else {
            //Path 2
        }
        if (condition) {
            //Path 3
        } else {
            //Path 4
        }
    }
In the above example, there are theoretically four possible paths, but in practice only two can occur: {1, 3} and {2, 4}. Paths {1, 4} and {2, 3} can never be executed by any test case. Needless to say, full path coverage is very difficult to measure and achieve.
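The infeasible paths can be confirmed by recording the path taken for every possible input. This is a minimal sketch; `pathTaken` mirrors the structure of `pathCoverageExample` but returns the branch labels instead of doing any work, and the names are invented for the illustration.

```java
import java.util.Set;
import java.util.TreeSet;

class PathCoverageDemo {
    /** Mirrors pathCoverageExample, returning the labels of the branches taken. */
    static String pathTaken(boolean condition) {
        StringBuilder path = new StringBuilder();
        if (condition) {
            path.append("1");
        } else {
            path.append("2");
        }
        path.append(",");
        if (condition) {
            path.append("3");
        } else {
            path.append("4");
        }
        return path.toString();
    }

    /** Tries every possible input and collects the distinct paths observed. */
    static Set<String> feasiblePaths() {
        Set<String> paths = new TreeSet<>();
        for (boolean condition : new boolean[] { true, false }) {
            paths.add(pathTaken(condition));
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(feasiblePaths()); // [1,3, 2,4]
    }
}
```

Because both if statements test the same unchanged variable, no input can ever produce the paths 1,4 or 2,3.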
There are a variety of coverage tools on the market now. Most of them use one of the following techniques to measure coverage:
Instrumentation is the process of inserting trace statements into the code that can be used for profiling purposes. As the name suggests, source instrumentation-based tools work by inserting custom trace calls into the source code of the application, re-compiling it, and then running the test suite over the modified application. One disadvantage of this approach is that double compilation may slow the process, particularly for large projects. For example, Clover and Jester work this way.
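To give a feel for what source instrumentation produces, here is a hand-written sketch of a small method after trace calls have been inserted. It is not the output of Clover or any other tool: the `hit` probe, the probe numbering, and the `BitSet` store are all assumptions made for illustration.

```java
import java.util.BitSet;

class SourceInstrumentationSketch {
    /** One bit per probe; a set bit means the probe was reached at least once. */
    static final BitSet HITS = new BitSet();

    /** Hypothetical trace call that an instrumenter would insert into the source. */
    static void hit(int probeId) {
        HITS.set(probeId);
    }

    // Original source, before instrumentation:
    //     static int abs(int x) { if (x < 0) { return -x; } return x; }
    static int absInstrumented(int x) {
        hit(0); // method entry
        if (x < 0) {
            hit(1); // negative branch
            return -x;
        }
        hit(2); // fall-through
        return x;
    }

    /** Percentage of probes reached so far. */
    static double coveragePercent(int probeCount) {
        return 100.0 * HITS.cardinality() / probeCount;
    }

    public static void main(String[] args) {
        absInstrumented(5);
        System.out.println(coveragePercent(3)); // two of three probes reached
        absInstrumented(-5);
        System.out.println(coveragePercent(3)); // all three probes reached
    }
}
```

After the tests run, the tool only needs to read the probe store and map each probe back to its source location to produce a report.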
A second group of tools works by inserting instrumentation instructions directly into the compiled files (bytecode or IL) rather than the source code. This has two advantages: it makes the implementation of these tools easier, since bytecode is much tidier than source code; and it makes them faster, since instrumenting bytecode can usually be done more quickly than instrumenting source code. Instrumenting compiled code can be done either by modifying the compiled files (as a second compilation step) or by using a custom class loader. The open source tools EMMA and Cobertura use the former approach. Hansel, a JUnit extension, uses custom class loaders in combination with explicit instrumentation, through which the developer encapsulates JUnit tests in Hansel’s wrapper classes. Through this approach, Hansel measures the specific coverage of individual tests, instead of reporting coverage over the whole code base.
A third kind of tool uses custom virtual machines to profile the code during testing. The virtual machine itself keeps track of the executed portions of the classes that are loaded during testing. These tools use APIs such as JVMPI, JVMDI, or JVMTI to measure coverage. An example of this approach is Java JVMDI Coverage Tool.
There are also several other coverage measures that this article does not cover, such as loop coverage, race coverage, and linear code sequence and jump (LCSAJ) coverage. There are also related methods, such as Mutation Testing and Residual Test Coverage, which you may want to explore.
So, how much coverage is enough? This question is open to debate. Purists believe that coverage should be nothing short of 100%. Achieving 100% code coverage is not an easy task, however, and as we’ve seen it is no guarantee that the code base is entirely free of bugs. In general, though, as the percentage of code coverage rises we can feel increasingly confident that the code base is less likely to contain bugs. Many experts cite 80% as a good threshold for code coverage - at that level, you get a significant improvement in the quality of the code base without having to go through all the extra work it would take to get from 80% to 100%.
Wherever TopCoder chooses to set the bar, though, I believe code coverage will help. As kyky remarked in this forum post, “the chances of improving something are zero unless you measure it.” Measuring coverage for TopCoder components will definitely make developers more aware of the extensiveness of their unit tests, and will improve the overall quality of the components.
One final note: In this article, I’ve covered the what of code coverage and deliberately left out the how, which will be the topic of a future article. You’re welcome to share your experience and opinions of code coverage in the forums, and let me know if there is anything I’ve overlooked. Thank you!